WHAT IS CLAIMED IS: 



1. A packet voice conferencing method comprising:

concurrently receiving a first packet voice data stream from a first conferencing 
endpoint and a second packet voice data stream from a second conferencing endpoint; 

mapping the voice data from the first packet voice data stream to a first set of 
presentation mixing channels in a manner that simulates that voice data as originating in a
first sector of a presentation sound field; 

mapping the voice data from the second packet voice data stream to a second set of 
presentation mixing channels in a manner that simulates that voice data as originating in a 
second sector of a presentation sound field, the second sector substantially non-overlapping
the first sector; and

mixing each channel from the first set of presentation mixing channels with the 
corresponding channel from the second set of presentation mixing channels to form a first set 
of mixed channels. 
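
By way of illustration only, the mapping and mixing steps of claim 1 can be sketched for two monaural streams and a two-channel (stereo) presentation. The constant-power panning law, the azimuth convention (-90 degrees is fully left, +90 degrees fully right), the sector boundaries, and the names `pan_gains`, `map_to_sector`, and `mix` are all assumptions made for this sketch; the claim does not limit the mapping to this technique.

```python
import math

def pan_gains(azimuth_deg):
    """Constant-power stereo gains for an azimuth in [-90, 90] degrees.

    Assumed convention: -90 is fully left, +90 is fully right.
    """
    theta = (azimuth_deg + 90.0) / 180.0 * (math.pi / 2.0)
    return math.cos(theta), math.sin(theta)

def map_to_sector(samples, sector):
    """Map mono samples to (left, right) channels at the centre of a sector.

    `sector` is a hypothetical (low_deg, high_deg) pair.
    """
    low, high = sector
    gain_left, gain_right = pan_gains((low + high) / 2.0)
    return [s * gain_left for s in samples], [s * gain_right for s in samples]

def mix(set_a, set_b):
    """Mix each channel of set_a with the corresponding channel of set_b."""
    return tuple([x + y for x, y in zip(ca, cb)] for ca, cb in zip(set_a, set_b))

# Two substantially non-overlapping sectors (assumed values).
first_sector = (-90.0, 0.0)
second_sector = (0.0, 90.0)

first_set = map_to_sector([1.0, 0.5], first_sector)
second_set = map_to_sector([0.2, 0.4], second_sector)
mixed_left, mixed_right = mix(first_set, second_set)
```

In this sketch the first stream is louder in the left channel and the second in the right, so a listener would tend to perceive them as arriving from different directions.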



2. The method of claim 1, further comprising, for a first packet voice data stream containing 
information from which voice directional information can be derived: 

deriving a voice arrival direction for the voice data in the first packet voice data
stream;

dividing the first sector into at least two subsectors, each subsector corresponding to a 
range of voice arrival directions; and 

when mapping the voice data from the first packet voice data stream to the first set of
presentation mixing channels, performing the mapping in a manner that simulates that voice
data as originating in the subsector of the presentation sound field corresponding to the voice
arrival direction angle presently derived for the voice data in the first packet voice data
stream.

3. The method of claim 1, further comprising:

receiving, concurrently with the first and second packet voice data streams, a third 
packet voice data stream from a third conferencing endpoint; 

mapping the voice data from the third packet voice data stream to a third set of
presentation mixing channels in a manner that simulates that voice data as originating in a
third sector of a presentation sound field, the third sector substantially non-overlapping the
first and second sectors; 

mixing each channel from the first set of presentation mixing channels with the
corresponding channel from the third set of presentation mixing channels to form a second set
of mixed channels; 

mixing each channel from the second set of presentation mixing channels with the
corresponding channel from the third set of presentation mixing channels to form a third set 
of mixed channels; and 

transmitting the first, second, and third sets of mixed channels respectively to the
third, second, and first conferencing endpoints. 
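
As a hedged sketch of the three-way arrangement in claim 3, each endpoint can be sent the mix of the other two endpoints' presentation mixing channels. The function names and the representation of a channel set as a list of sample lists are assumptions for illustration only.

```python
def mix_channels(set_a, set_b):
    """Mix corresponding channels of two channel sets, sample by sample."""
    return [[x + y for x, y in zip(ca, cb)] for ca, cb in zip(set_a, set_b)]

def mixes_for_three(first, second, third):
    """Return the mixed channel set destined for each endpoint.

    Each endpoint receives the mix that excludes its own voice data.
    """
    first_mix = mix_channels(first, second)   # first set of mixed channels
    second_mix = mix_channels(first, third)   # second set of mixed channels
    third_mix = mix_channels(second, third)   # third set of mixed channels
    return {"third": first_mix, "second": second_mix, "first": third_mix}

# Toy two-channel data, two samples per channel (assumed values).
out = mixes_for_three(
    first=[[1, 1], [0, 0]],
    second=[[0, 0], [2, 2]],
    third=[[4, 4], [4, 4]],
)
```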



4. The method of claim 3, further comprising establishing a control protocol with one of the
first, second, and third conferencing endpoints, and accepting protocol messages from that
conferencing endpoint specifying the extent of the first, second, and third sectors of the
presentation sound field.

5. The method of claim 1, wherein mapping voice data to a set of presentation mixing
channels comprises a method selected from the group consisting of:

splitting a voice data channel from the voice data into at least two voice data channels;

changing the relative delay of one voice data channel from the voice data with respect
to another of the voice data channels;

changing the relative phase of one voice data channel from the voice data with respect
to another of the voice data channels;

changing the relative amplitude of one voice data channel from the voice data with 
respect to another of the voice data channels; 

splitting a portion of one voice data channel from the voice data and adding that 
portion to another of the voice data channels; and

combinations thereof. 



6. The method of claim 1, further comprising pictorially displaying, on a graphical user
interface, a representation of a sound field and representations of each conferencing endpoint
to a listener at one conferencing endpoint, allowing that listener to manipulate the interface in
order to indicate desired locations of the conferencing endpoints within the sound field, and
using the listener's manipulations to set the extent of the sectors of the presentation sound
field.

7. An apparatus comprising a computer-readable medium containing computer instructions
that, when executed, cause a processor or multiple communicating processors to perform a 
method for packet voice conferencing, the method comprising: 

concurrently receiving a first packet voice data stream from a first conferencing
endpoint and a second packet voice data stream from a second conferencing endpoint;

mapping the voice data from the first packet voice data stream to a first set of
presentation mixing channels in a manner that simulates that voice data as originating in a
first sector of a presentation sound field;

mapping the voice data from the second packet voice data stream to a second set of
presentation mixing channels in a manner that simulates that voice data as originating in a
second sector of a presentation sound field, the second sector substantially non-overlapping
the first sector; and

mixing each channel from the first set of presentation mixing channels with the
corresponding channel from the second set of presentation mixing channels to form a first set
of mixed channels.

8. The apparatus of claim 7, the method further comprising:

receiving, concurrently with the first and second packet voice data streams, a third
packet voice data stream from a third conferencing endpoint; 

mapping the voice data from the third packet voice data stream to a third set of 
presentation mixing channels in a manner that simulates that voice data as originating in a 
third sector of a presentation sound field, the third sector substantially non-overlapping the 
first and second sectors;

mixing each channel from the first set of presentation mixing channels with the
corresponding channel from the third set of presentation mixing channels to form a second set
of mixed channels;

mixing each channel from the second set of presentation mixing channels with the
corresponding channel from the third set of presentation mixing channels to form a third set
of mixed channels; and

transmitting the first, second, and third sets of mixed channels respectively to the
third, second, and first conferencing endpoints.







9. The apparatus of claim 8, the method further comprising establishing a control protocol
with one of the first, second, and third conferencing endpoints, and accepting protocol
messages from that conferencing endpoint specifying the extent of the first, second, and third
sectors of the presentation sound field.

10. The apparatus of claim 8, the method further comprising establishing a control protocol
session with a user interface for a participant located at one of the conferencing endpoints,
and accepting protocol messages from that user interface specifying the division of the
presentation sound field for that endpoint.

11. The apparatus of claim 10, the method further comprising establishing a control protocol
session with a user interface for a participant located at each of the other conferencing
endpoints, thereby allowing each endpoint to specify its own division of the presentation
sound field.



12. The apparatus of claim 7, wherein mapping voice data to a set of presentation mixing
channels comprises a method selected from the group consisting of: 

splitting a voice data channel from the voice data into at least two voice data channels;

changing the relative delay of one voice data channel from the voice data with respect
to another of the voice data channels;

changing the relative phase of one voice data channel from the voice data with respect
to another of the voice data channels; 

changing the relative amplitude of one voice data channel from the voice data with 
respect to another of the voice data channels; 

splitting a portion of one voice data channel from the voice data and adding that 
portion to another of the voice data channels; and

combinations thereof. 

13. The apparatus of claim 12, wherein a mapping is performed on a subchannel basis.



14. The apparatus of claim 7, the method further comprising, when voice data from one of the
conferencing endpoints is received monaurally, mapping the voice data into multiple voice
data channels.



15. The apparatus of claim 7, the method further comprising, when voice data from one of the
conferencing endpoints comprises multiple voice data channels:

measuring the relative delay between at least two of the multiple channels;

estimating, from the measured relative delay, the arrival direction of a voice signal
present in the voice data; and

accounting for the estimated arrival direction during mapping of the voice data into a
set of presentation mixing channels. 
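
One possible way to carry out the measuring and estimating steps of claim 15 is sketched below, assuming two capture channels from microphones a known distance apart and a far-field source. The cross-correlation search, the microphone spacing, the sample rate, the speed of sound (343 m/s), and the sign convention (a positive lag means the second channel hears the voice later) are assumptions of this sketch, not requirements of the claim.

```python
import math

def relative_delay(left, right, max_lag):
    """Return the lag (in samples) of `right` relative to `left` that
    maximizes their cross-correlation, searching -max_lag..max_lag."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, sample in enumerate(left):
            j = i + lag
            if 0 <= j < len(right):
                score += sample * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def arrival_angle_deg(lag, sample_rate_hz, mic_spacing_m, speed_of_sound=343.0):
    """Estimate the arrival angle from a delay using the far-field relation
    sin(angle) = c * delay / spacing, clamped to a valid range."""
    x = speed_of_sound * lag / (sample_rate_hz * mic_spacing_m)
    x = max(-1.0, min(1.0, x))
    return math.degrees(math.asin(x))

# Toy impulse that reaches the second channel two samples later.
left = [0, 0, 1, 0, 0, 0, 0, 0]
right = [0, 0, 0, 0, 1, 0, 0, 0]
lag = relative_delay(left, right, max_lag=3)
angle = arrival_angle_deg(lag, sample_rate_hz=8000, mic_spacing_m=0.2)
```

The estimated angle could then be passed to whatever mapping step places the voice within the presentation sound field.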



16. The apparatus of claim 7, the method further comprising, for a first packet voice data
stream containing information from which voice directional information can be derived:

deriving a voice arrival direction for the voice data in the first packet voice data
stream;

dividing the first sector into at least two subsectors, each subsector corresponding to a 
range of voice arrival directions; and

when mapping the voice data from the first packet voice data stream to the first set of
presentation mixing channels, performing the mapping in a manner that simulates that voice 
data as originating in the subsector of the presentation sound field corresponding to the voice 
arrival direction angle presently derived for the voice data in the first packet voice data 
stream. 
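
The subsector selection recited above can be sketched as follows, under the assumptions that arrival directions span -90 to +90 degrees, that a sector is a (low, high) pair of angles in the presentation sound field, and that subsectors are of equal width. The function name `select_subsector` and these conventions are hypothetical.

```python
def select_subsector(sector, arrival_deg, n_subsectors=2,
                     arrival_range=(-90.0, 90.0)):
    """Pick the subsector of `sector` whose range of voice arrival
    directions contains `arrival_deg`.

    The arrival range is divided into n_subsectors equal ranges, and each
    range corresponds to one equal-width slice of the sector.
    """
    low, high = sector
    a_low, a_high = arrival_range
    frac = (arrival_deg - a_low) / (a_high - a_low)
    index = min(n_subsectors - 1, max(0, int(frac * n_subsectors)))
    width = (high - low) / n_subsectors
    return (low + index * width, low + (index + 1) * width)

# A talker on one side of the remote room maps to one half of the sector,
# and a talker on the other side maps to the other half.
near_side = select_subsector((0.0, 60.0), -45.0)
far_side = select_subsector((0.0, 60.0), 45.0)
```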



17. The apparatus of claim 7, the method further comprising pictorially displaying, on a
graphical user interface, a representation of a sound field and representations of each
conferencing endpoint to a listener at one conferencing endpoint, allowing that listener to
manipulate the interface in order to indicate desired locations of the conferencing endpoints
within the sound field, and using the listener's manipulations to set the extent of the sectors of
the presentation sound field.



18. The apparatus of claim 17, the method further comprising allowing the listener to divide a
sector into subsectors, and to manipulate each subsector of that sector within the presentation
sound field independent of the other subsectors of that sector.

19. The apparatus of claim 17, wherein the graphical user interface further allows the listener
to specify the number and locations of presentation channel acoustical speakers relative to
that listener's position in a room, the method further comprising accounting for the number
and locations of presentation channel acoustical speakers in mapping voice data to
presentation mixing channels.




20. The apparatus of claim 17, the method further comprising recurrently updating the
graphical user interface with a visual indication of which endpoint or endpoints is/are
currently transmitting voice data. 



21. The apparatus of claim 17, the method further comprising automatically dividing the 
presentation sound field into sectors that allocate approximately equal shares of the
presentation sound field to each endpoint. 






22. The apparatus of claim 21, the method further comprising tracking the number of 
conferencing endpoints participating in a conference, and automatically altering the allocation
of the presentation sound field as endpoints are added to or leave the conference.
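
A minimal sketch of the automatic allocation in claims 21 and 22 follows, assuming the presentation sound field spans -90 to +90 degrees and that each endpoint is identified by a string. The function `allocate_sectors` is hypothetical; it is simply re-run whenever the tracked list of endpoints changes.

```python
def allocate_sectors(endpoints, field=(-90.0, 90.0)):
    """Divide the sound field into equal, non-overlapping sectors,
    one per endpoint, in list order."""
    if not endpoints:
        return {}
    low, high = field
    width = (high - low) / len(endpoints)
    return {
        endpoint: (low + i * width, low + (i + 1) * width)
        for i, endpoint in enumerate(endpoints)
    }

# Tracking participation: re-allocate when an endpoint joins.
participants = ["A", "B"]
two_way = allocate_sectors(participants)
participants.append("C")
three_way = allocate_sectors(participants)
```

An unequal allocation, such as giving a multi-channel endpoint a larger sector, would need per-endpoint weights, which this sketch omits.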

23. The apparatus of claim 21, wherein a larger sector of the sound field is allocated to a 
conferencing endpoint that is broadcasting multiple capture channels than is allocated to a
conferencing endpoint that is broadcasting monaurally.



24. A packet voice conferencing system comprising:

means for concurrently receiving multiple packet voice data streams;

means for manipulating the voice data in each of the packet voice data streams in a
manner that simulates that voice data as originating in a specified sector of a presentation
sound field, the sectors arranged in the sound field in substantially non-overlapping fashion;

means for combining the manipulated voice data from each packet voice data stream
into a set of presentation channels.

25. The packet voice conferencing system of claim 24, further comprising means for
specifying the sector of the presentation sound field to be applied to each packet voice data
stream.



26. The packet voice conferencing system of claim , further comprising means for varying the
specified sector of the presentation sound field for a packet voice data stream depending on a
voice arrival direction derived for that packet voice data stream.

27. The packet voice conferencing system of claim 24, incorporated into one of the
conferencing endpoints.

28. A packet voice conferencing system comprising:

first and second decoders, to respectively decode first and second packet voice data 
streams and produce first and second sets of one or more voice data channels from the voice 
data packets contained in the streams; 
a packet switch to receive packet voice data streams sent to the system by first and
second conferencing endpoints and to distribute the packet voice data stream received from
the first conferencing endpoint to the first decoder and the packet voice data stream received
from the second conferencing endpoint to the second decoder;

a first channel mapper to map the first set of voice data channels to a first set of
presentation mixing channels in a manner that simulates the voice data as originating in a first
sector of a presentation sound field; 

a second channel mapper to map the second set of voice data channels to a second set
of presentation mixing channels in a manner that simulates the voice data as originating in a
second sector of a presentation sound field, the second sector substantially non-overlapping
the first sector; and 



a first set of mixers, each mixer combining one of the first set of presentation mixing
channels with a corresponding one of the second set of presentation mixing channels to form
a mixed channel, the set of mixers collectively forming a first set of mixed channels.



29. The packet voice conferencing system of claim 28, further comprising:



a third decoder to decode a third packet voice data stream and produce a third set of
one or more voice data channels from the voice data packets contained in the third stream, the
packet switch receiving the third packet voice data stream from a third conferencing endpoint
and distributing the third packet voice data stream to the third decoder;



a third channel mapper to map the third set of voice data channels to a third set of
presentation mixing channels in a manner that simulates the voice data as originating in a
third sector of a presentation sound field, the third sector substantially non-overlapping the
first and second sectors;

a second set of mixers, each mixer in the second set combining one of the first set of
presentation mixing channels with a corresponding one of the third set of presentation mixing
channels to form a mixed channel, the second set of mixers collectively forming a second set 
of mixed channels; 

a third set of mixers, each mixer in the third set combining one of the second set of
presentation mixing channels with a corresponding one of the third set of presentation mixing
channels to form a mixed channel, the third set of mixers collectively forming a third set of 
mixed channels; and 

a transmitter to dispatch the first, second, and third sets of mixed channels 
respectively to the third, second, and first conferencing endpoints.

30. The packet voice conferencing system of claim 29, further comprising a controller
connected to each channel mapper, the controller configuring each channel mapper according
to its designated sound field sector. 

31. The packet voice conferencing system of claim 30, wherein the controller communicates 
with one of the first, second, and third conferencing endpoints using a control protocol and
accepts protocol messages from that conferencing endpoint specifying the extent of the first,
second, and third sectors of the presentation sound field.



32. The packet voice conferencing system of claim 28, further comprising a jitter buffer for
each voice data channel, each jitter buffer delaying its respective voice data channel prior to
submission to a mapper. 

33. The packet voice conferencing system of claim 32, further comprising a controller
connected to the jitter buffers to synchronize the relative delays of multiple jitter buffers
associated with a common mixed channel. 



34. The packet voice conferencing system of claim 28, further comprising a graphical user
interface driver to create a display for a listener and manipulate that display in response to
listener inputs, the display including a representation of a sound field and representations of
each conferencing endpoint, the driver using listener inputs to set the extent of the sectors of
the presentation sound field.




35. The packet voice conferencing system of claim 34, wherein the graphical user interface
driver allows the listener to divide a sector into subsectors, and to manipulate each subsector
of that sector within the presentation sound field independent of the other subsectors of that
sector.

36. A packet voice conferencing system comprising:

a decoder, to decode a packet voice data stream to produce a set of one or more voice 
data channels from the voice data packets contained in the streams and a voice arrival 
direction corresponding to the set of voice data channels; 

a controller to select one of a plurality of presentation sound field subsectors for the
voice data channels based on the voice arrival direction, each subsector corresponding to a
range of voice arrival directions; and

a channel mapper to map the set of voice data channels to a set of presentation
channels in a manner that simulates the voice data as originating in the selected subsector of
the presentation sound field.



37. The packet voice conferencing system of claim 36, wherein the voice arrival direction is
explicitly communicated in the packet voice data stream.

38. The packet voice conferencing system of claim 36, wherein the set of voice data channels
comprises two or more channels, and wherein the decoder comprises a direction finder to
estimate the voice arrival direction by comparing at least one of the voice data channels to
another of the voice data channels.

39. A packet voice conferencing system having one or more local audio capture channels, the
system comprising: 

a controller to negotiate with other packet voice conferencing systems connected in a
common conference, wherein the results of a negotiation include a codec to be used by the
system for encoding the local audio capture channels and a presentation sound field sector
allocated to the local audio capture channels; 

a channel mapper to map the local audio capture channels to a set of presentation
mixing channels in a manner that simulates the audio data on the capture channels as 
originating in the allocated presentation sound field sector; and 

an encoder to encode the presentation mixing channels into a packet voice data
stream.